(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 0 978 781 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 09.02.2000 Bulletin 2000/06

(21) Application number: 99305916.1

(22) Date of filing: 26.07.1999

(51) Int. Cl.7: G06F 1/32

(84) Designated Contracting States:
     AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
     Designated Extension States:
     AL LT LV MK RO SI

(30) Priority: 03.08.1998 US 128030

(71) Applicant: LUCENT TECHNOLOGIES INC.
     Murray Hill, New Jersey 07974-0636 (US)

(72) Inventors:
     • Nicol, Christopher John
       Springwood, N.S.W. (AU)
     • Singh, Kanwar Jit
       Hazlet, New Jersey 07730 (US)

(74) Representative:
     Buckley, Christopher Simon Thirsk et al
     Lucent Technologies (UK) Ltd,
     5 Mornington Road
     Woodford Green, Essex IG8 0TU (GB)


(54) Power reduction in a multiprocessor digital signal processor 



(57) Improved operation of multi-processor chips is
achieved by dynamically controlling the processing load of
the chips and controlling, with significantly greater than
on/off granularity, the operating voltages of those chips so
as to minimize overall power consumption. A controller in
a multi-processor chip allocates tasks to the individual
processors to equalize processing load among the
chips, then the controller lowers the clock frequency on
the chip to as low a level as possible while assuring
proper operation, and finally reduces the supply voltage.
Further improvement is possible by controlling the supply
voltage of individual processing elements within the
multi-processor chip, as well as controlling the supply
voltage of other elements in the system within which the
multi-processor chip operates.
Description

Background 

[0001] This invention relates to electronic circuits and,
more particularly, to power consumption within electronic
circuits.

[0002] Integrated circuits are designed to meet speed
requirements under worst-case operating conditions. In
Lucent Technologies' 0.35 µm 3.3V CMOS technology,
the "worst-case-slow" condition is specified for a
temperature of 125°C and a chip supply voltage, VDD, of 2.7V.
The worst-case power consumption of the chip is quoted
at the maximum supply voltage of 3.6V. The difference
in chip performance at the "worst-case-slow", nominal,
and "worst-case-fast" conditions is shown in FIG. 1,
where the frequency of a 25-stage ring oscillator is
shown at different supply voltages and process corners.
At the nominal operating voltage of 3.3V, the speed
difference between "worst-case-slow" (WCS) and
"worst-case-fast" (WCF) is a factor of 2.2. From the graph it can
be seen that if a chip is designed to operate at 140MHz
and at 2.7V supply even when it is "worst-case-slow", a
manufactured chip whose characteristics happen to be
nominal will continue to operate at 140MHz even when
the chip supply is reduced to 2.1V.
[0003] The power consumption of a CMOS circuit
increases linearly with operating frequency and
quadratically with supply voltage. Therefore, a reduction in
supply voltage can significantly reduce power consumption.
For example, by reducing the nominal operating voltage
from 3.3V to 2.1V, the nominal power consumption of
a 140MHz chip is reduced by 60% without altering the
circuit. This, of course, presumes an ability to identify
and measure a chip's variation from nominal
characteristics, and an ability to modify the supply voltage based
on this measurement.
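
The 60% figure above can be checked with a short calculation. The following is a minimal sketch, assuming only the proportionality stated in the paragraph (power proportional to frequency times supply voltage squared); the function name is illustrative and does not come from the application.

```python
def relative_dynamic_power(f_ratio, v_ratio):
    """Power relative to a baseline, assuming P is proportional to f * V^2."""
    return f_ratio * v_ratio ** 2


# Same 140MHz clock (frequency ratio 1.0), supply lowered from 3.3V to 2.1V.
saving = 1.0 - relative_dynamic_power(1.0, 2.1 / 3.3)
print(f"power saving: {saving:.1%}")
```

Under this simple model the saving comes out at just under 60%, in line with the figure quoted in the text.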

[0004] To achieve variable power supply voltage scaling,
a programmable dc-dc converter may be used.
Probably the most efficient approach in use today is the
buck converter circuit. Such circuits are well known in the art.
[0005] Voltage scaling as a function of temperature
has been incorporated into the Intel Pentium product
family as a technique to achieve high performance at
varying operating temperatures and process corners. It
is described in US Patent No. 5,440,520. The approach
uses an on-chip temperature sensor and associated
processing circuitry which issues a code to the off-chip
power supply to provide a particular supply voltage. The
process variation information is hard-coded into each
device as a final step of manufacturing. This approach
has the disadvantage of costly testing of each chip to
determine its variance from nominal processing. Several
manufacturers make Pentium-compatible dc-dc converter
circuits, which are highlighted in "Powering the
Big Microprocessors", by B. Travis, EDN, August 15, 1997,
pp. 31-44.

[0006] Recently, there has been considerable interest
in integrating much of the buck controller circuit onto the
chip. The only off-chip components are the inductor
(typically about 10 µH) and capacitor (typically about 30 µF)
used in the buck converter. Efficiencies in excess of 80%
are typical for a range of voltages and load currents.
See, for example, "A High-Efficiency Variable Voltage
CMOS Dynamic dc-dc Switching Regulator," by W.
Namgoong, M. Yu, and T. Meng, Proceedings ISSCC97,
pp. 380-381, February 1997. Researchers have also
been experimenting with on-chip voltage scaling
techniques to counter process and temperature variations.
See "Variable Supply-Voltage Scheme for Low Power
High-Speed CMOS Digital Design," by T. Kuroda et al,
CICC97 Conference Proceedings, and JSSC issue of
CICC97, May 1998. The Kuroda et al paper demonstrates
that the speed of the circuit can be maintained
(or at least the speed degradation can be minimized) by
tuning the threshold voltages even as the supply voltage
is lowered. The tuning is achieved on-chip by varying
the substrate-bias voltage. These techniques are needed
to ensure that the leakage current, which increases
as the threshold voltage is reduced, does not become
too large.

[0007] Thus, it is known that varying supply voltage to 
a chip can improve performance by eliminating unex- 
pected variability in the supply voltage, and by account- 
ing for process and operating temperature variations. 

Summary of the Invention 

[0008] Improved performance of multi-processor
chips is achieved by dynamically controlling the
processing load of chips and controlling, with significantly
greater than on/off granularity, the operating voltages
of those chips so as to minimize overall power
consumption. A controller in a multi-processor chip
allocates tasks to the individual processors to equalize
processing load among the chips, then the controller
lowers the clock frequency on the chip to as low a level
as possible while assuring proper operation, and finally
reduces the supply voltage. Further improvement is
possible by controlling the supply voltage of individual
processing elements within the multi-processor chip, as
well as controlling the supply voltage of other elements
in the system within which the multi-processor chip
operates.

Brief Description of the Drawings 
[0009] 

FIG. 1 illustrates the maximum operating frequency
that is achievable with a 0.35 µm technology CMOS
chip as a function of supply voltage;

FIG. 2 presents a block diagram of a multi-processor
chip with supply voltage control in accordance
with the principles disclosed herein;

FIG. 3 shows the relationship between the voltage
control clock, Clk, of FIG. 2, the clock applied to the
processing elements of FIG. 2, Clk-L, and the supply
voltage applied to the processing elements,
VDD-local; and

FIG. 4 depicts the block diagram of a multi-proces- 
sor chip with supply voltage control that is individual 
to each of the processing elements. 

Detailed Description 

[0010] FIG. 2 depicts a block diagram of a multi-processor
chip. It contains processing elements (PEs) 100,
101, 102, 103, ... 104, and each PE contains a central
processing unit (CPU) and a local cache memory (not
shown). A real-time operating system resides in PE 100
and allocates tasks to the other PEs from a mix of many
digital signal processing applications. The load of the
FIG. 2 system is time varying and is dependent on the
applications that are being executed at any given time.
For example, a set-top-box for a multimedia broadband
access system might need to receive an HDTV signal.
It could also be transmitting data from a computer to the
Internet, and responding to button requests from a
remote control handset. Over time, this dynamic mix of
applications places different load requirements on the
system.

[0011] For a maximally utilized system, all of the available
processors ought to be operating at full speed when
satisfying the maximum load encountered by the system.
At such a time, the power consumption of the
multiprocessor chip is at its maximum level. However, as
the load requirements are lowered, the system should,
advantageously, reduce its power consumption. It may
be noted that, typically, computers spend 99% of their
time waiting for a user to press a key. This presents a
great opportunity to drastically reduce the average power
consumption. The specific approach by which the
system "scales back" its performance can greatly
impact the realizable power savings.
[0012] In the FIG. 2 arrangement, in accordance with
the principles disclosed herein, the applications that
need to be processed are mapped to the N PEs under
control of a real-time operating system (RTOS) executed
on PE 100. If the number of instructions that need to be
executed for each task is known and made available to
the operating system, a scheduler within the operating
system can use this information to determine the best
way to allocate the tasks to the available processors in
order to balance the computation. The intermediate
goal, of course, is to maximize the parallelism and to
evenly distribute the load presented to the FIG. 2 system
among all of the PEs.
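
The application does not say which scheduling algorithm the operating system uses. As one illustration of balancing known instruction counts across PEs, the sketch below uses a greedy longest-task-first heuristic; this heuristic, the function name `allocate_tasks`, and the sample instruction counts are all assumptions made for the example.

```python
import heapq


def allocate_tasks(instruction_counts, num_pes):
    """Assign tasks to PEs, always giving the next-largest task to the least-loaded PE.

    This greedy heuristic is an assumption for illustration; it is not
    taken from the application.
    """
    heap = [(0, pe) for pe in range(num_pes)]  # (current load, PE index)
    heapq.heapify(heap)
    assignment = {pe: [] for pe in range(num_pes)}
    # Visit tasks from the largest instruction count to the smallest.
    order = sorted(enumerate(instruction_counts), key=lambda item: -item[1])
    for task_id, count in order:
        load, pe = heapq.heappop(heap)
        assignment[pe].append(task_id)
        heapq.heappush(heap, (load + count, pe))
    loads = {pe: sum(instruction_counts[t] for t in tasks)
             for pe, tasks in assignment.items()}
    return assignment, loads


# Hypothetical instruction counts (in millions) for five tasks on two PEs.
assignment, loads = allocate_tasks([7, 5, 4, 3, 1], 2)
```

The load on the most heavily loaded PE is the quantity that later determines the clock frequency, so a scheduler of this kind aims to keep that maximum as small as possible.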

[0013] When an application that is running on the FIG.
2 system is subdivided into N concurrent task streams,
as suggested above, each of the PEs becomes lightly
loaded. This allows the clock frequency of the PEs to be
reduced, and if the task division can be carried out
perfectly then the clock frequency of the FIG. 2 system can
be reduced by a factor of N. Reducing the frequency, as
indicated above, allows reducing the necessary supply
voltage, and reducing the supply voltage reduces the
system's power consumption (quadratically). To
illustrate, if a given application that is executed on 1 PE
requires operating the PE at 140MHz, it is known from
FIG. 1 that the PE can be operated at approximately a
2.7V supply. When the application is divided into two
concurrent tasks and assigned to two PEs that are
designed to operate at 140MHz from a 2.7V supply, then
the PEs can be operated at 70 MHz and at a supply
voltage of 1.8V. This reduction in operating voltage
represents a power saving of 55%. Of course, it is unlikely
that an application can be perfectly divided into two
equal load task streams and, therefore, the 55% power
saving is the maximum achievable power saving for two
PEs.
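
The 55% figure can be reproduced with a short calculation. The sketch below assumes a perfect division of the work and the same proportionality used earlier in the description (power proportional to frequency times voltage squared); the function name is illustrative only.

```python
def split_power_ratio(n_pes, v_single, v_split):
    """Total power of n PEs at f/n and v_split, relative to one PE at f and v_single.

    Assumes a perfect division of work and P proportional to f * V^2.
    """
    per_pe = (1.0 / n_pes) * (v_split / v_single) ** 2
    return n_pes * per_pe


# One PE at 140MHz and 2.7V versus two PEs at 70MHz and 1.8V.
two_pe_saving = 1.0 - split_power_ratio(2, 2.7, 1.8)
print(f"two-PE power saving: {two_pe_saving:.1%}")
```

Because the total number of clocked cycles per second is unchanged (two PEs at half the frequency), the whole saving in this model comes from the squared voltage ratio.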

[0014] It should be understood that in the above
example, when two PEs are employed and their operating
frequency can be reduced to 70 MHz, the indicated
reduction presumes that it is desired to perform the given
tasks as if there were a single PE that operates at
140MHz. That is, the presumption is that there is a
certain time when the tasks assigned to the chip must be
finished. In fact, there might not be any particular
requirement for when the tasks are to be finished.
Alternatively, a requirement for when the tasks are to be
finished might not be related to the highest operating
frequency of the chip.

[0015] For example, the above-illustrated chip (where
each of the PEs is designed to operate at 140 MHz)
might be employed in a system whose basic frequency
is related to 160 MHz. In such an arrangement, dividing
tasks between the two PEs of the chip and operating
each of the PEs at 80MHz would be preferable because
it would be easier to synchronize the chip's input and
output functions to the other elements in the system.
Thus, in a sense it is the expected completion time for
the collection of assigned tasks that is controlling, and
the reduction of frequency from the maximum that the
chip can support may be controlled by the division of
tasks that may be accomplished.

[0016] Hence, the operating system of PE 100 needs
to ascertain the required completion time, divide the
collection of tasks as evenly as possible (in terms of
needed processing time), consider the PE with the tasks that
require the most time to carry out, and adjust the clock
frequency to ensure that the most heavily loaded PE
carries out its assigned tasks within the required completion
time. Once the frequency is thus determined, a minimum
supply voltage can be determined. The supply
voltage determination can be made by reference to a
plot like the one shown in FIG. 1 or, advantageously, by
evaluating the actual performance of the multiprocessor
at hand.
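
The procedure of this paragraph can be sketched as two steps: derive the clock frequency from the most heavily loaded PE and the completion time, then look up a supply voltage. In the sketch below the three operating points are the frequency and voltage pairs quoted elsewhere in this description; the linear interpolation between them, the rounding up to a 5MHz step, and all function names are assumptions made for illustration, not a statement of how FIG. 1 is actually used.

```python
import math

# (frequency in MHz, supply in V) pairs quoted in this description.
OPERATING_POINTS = [(70, 1.8), (100, 2.1), (140, 2.7)]
STEP_MHZ = 5  # assumed synthesizer resolution


def required_frequency_mhz(pe_cycle_counts, deadline_us):
    """Lowest step-aligned frequency at which the busiest PE meets the deadline."""
    worst = max(pe_cycle_counts)
    f_mhz = worst / deadline_us  # cycles per microsecond equals MHz
    return STEP_MHZ * math.ceil(f_mhz / STEP_MHZ)


def min_supply_voltage(f_mhz, points=OPERATING_POINTS):
    """Supply voltage for f_mhz, interpolating linearly between known points."""
    if f_mhz > points[-1][0]:
        raise ValueError("frequency exceeds the highest known operating point")
    if f_mhz <= points[0][0]:
        return points[0][1]
    for (f0, v0), (f1, v1) in zip(points, points[1:]):
        if f_mhz <= f1:
            return v0 + (v1 - v0) * (f_mhz - f0) / (f1 - f0)


# Hypothetical load: the busiest PE needs 1,000,000 cycles within 10,000 microseconds.
freq = required_frequency_mhz([900_000, 1_000_000], 10_000)
volt = min_supply_voltage(freq)
```

A table lookup of this kind corresponds to the first option named in the paragraph; the second option, measuring the actual chip, is what the calibration circuit described later provides.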

[0017] As indicated above, the operating system can
reduce the supply voltage even further by tracking
temperature and process variations. For example, when the
chip is nominal in its characteristics, then it can be
operated along line 20 of FIG. 1, which calls for only a 1.5V
supply when operating at 70MHz.
[0018] Returning the discussion to FIG. 2, the
programmable-frequency clock is generated using an
appropriately multiplied input reference clock (line 101) via
a phase lock loop frequency synthesizer circuit 110
which has a high resolution, e.g., can be altered in
increments of 5MHz. Advantageously, two clocks are
generated by PLL 110 (requiring two synthesizer circuits),
a Clk clock, and a Clk-L which is 1 frequency step lower
than Clk when Clk is being increased. For example, in
a PLL 110 unit that provides 5MHz resolution, when Clk
is being increased from 75 MHz to 80MHz, the value of
Clk-L is set to 75MHz.

[0019] Clk-L is applied to the PEs, while Clk is applied
to calibration circuit 120, which generates a supply
voltage command. The supply voltage command is applied
to dc-dc converter 130 followed by L-C circuit 140 to
cause the combination of converter 130 and L-C circuit
140 to create the supply voltage VDD-local, which is fed
back to calibration circuit 120 via line 102. The VDD-local
supply voltage is also applied to all of the PEs (excluding
perhaps the operating system PE 100).
[0020] The reason for having the frequency Clk-L lag
behind the frequency Clk is that the clock frequency
applied to the PEs should not be increased prior to the
supply voltage being increased to accommodate the higher
frequency. Otherwise, the PEs might fail to perform
properly. Circuit 120 observes the level on line 102 to
determine whether it corresponds to the voltage
necessary to make PEs 100-104 operate properly (described
below), and it also waits until the signal on line 102 is
stable (following whatever ringing occurs at the output of
L-C circuit 140). The signal on line 121 provides
information to PE 100 (yes/no) to inform the operating system
of when the supply voltage is stable. When the voltage
is stable and Clk has reached the required frequency,
the operating system sets Clk-L to Clk and then changes
the task allocation on the PEs to correspond to that
which the PEs were set up to accommodate.
[0021] FIG. 3 demonstrates the timing associated
with increasing Clk, Clk-L and VDD-local when a new task
is created and the load on the multiprocessor is thus
increased, and the timing associated with decreasing Clk,
Clk-L and VDD-local when the load on the multiprocessor
is decreased. Specifically, it shows the system operating
at 70MHz from a 1.8V supply when the load is increased
in three steps to 140MHz. When the 2.7V supply is
stable, as shown by the supply voltage plot, the new task
is enabled for execution. Some time thereafter, according
to FIG. 3, a task completes, which reduces the load
on the multiprocessor. The reduced load permits
lowering the clock frequency to 100MHz and lowering the
supply voltage to 2.1V. This, too, is accommodated in
steps (two steps, this time), with Clk-L preceding Clk to
ensure, again, that the PEs continue to operate properly
while the supply voltage is decreased.
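
The ordering rule described for FIG. 3 can be sketched as a sequence of (Clk, Clk-L) settings. The generator below is one reading of the text, assuming that Clk-L catches up with Clk after each step once the supply is stable; the function name and the default 5MHz step are illustrative assumptions.

```python
def ramp(start_mhz, target_mhz, step_mhz=5):
    """Yield (Clk, Clk-L) pairs while moving from start_mhz to target_mhz.

    Going up, Clk leads so that the supply is raised before the PE clock.
    Going down, Clk-L leads so that the PE clock drops before the supply.
    """
    clk = clk_l = start_mhz
    yield clk, clk_l
    while clk_l != target_mhz:
        if target_mhz > clk_l:
            clk = min(clk + step_mhz, target_mhz)
            yield clk, clk_l  # calibration raises the supply for the new Clk
            clk_l = clk       # applied only after the supply is reported stable
            yield clk, clk_l
        else:
            clk_l = max(clk_l - step_mhz, target_mhz)
            yield clk, clk_l  # PEs slow down first
            clk = clk_l       # then Clk follows and the supply can be lowered
            yield clk, clk_l


up = list(ramp(75, 80))
down = list(ramp(80, 70))
```

In every state produced by this sketch Clk-L is less than or equal to Clk, which is the property that keeps the PEs from being clocked faster than the present supply voltage supports.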
[0022] Calibration block 120 can use one of several
techniques to determine the voltage required to operate
the circuit at a given clock frequency. One technique is
given in the Kuroda et al article. Recognizing that each of
the PEs (101-104) has a critical path which controls the
ultimate speed of the PE, block 120 uses two copies of
that portion of the PE circuit that contains the critical
path of the PE circuit, with one of the copies being
purposely designed to be just slightly slower. Both of the
copies are operated from clock signal Clk and from the
VDD-local supply voltage of line 102, and that voltage is
adjusted within block 120 so that, while operating at
frequency Clk, the slightly slower PE fails to operate
properly while the other PE does operate properly. This
guarantees that the PEs are operating from a supply voltage
that is "just above" the point at which they are likely to
fail. Since the two critical path copies within element 120
experience the same variations in temperature as do
PEs 101-104, the VDD-local supply voltage appropriately
tracks the temperature variations as well as the different
operating frequency specifications.
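
The two-copy technique can be pictured as a small feedback loop: raise the supply if the normal copy fails, lower it if the slower copy still passes, and stop when the normal copy passes while the slower copy fails. The sketch below uses a purely hypothetical delay model (maximum frequency rising linearly above a 600mV threshold) and a hypothetical 5% slowdown for the slower copy; none of these numbers or names come from the application.

```python
def replica_passes(v_mv, f_mhz, slow_pct=100):
    """Hypothetical critical-path model: f_max = 0.1 MHz per mV above 600 mV.

    slow_pct=105 models a copy that is 5% slower. Integer arithmetic is
    used so that the pass/fail boundary is exact.
    """
    return (v_mv - 600) * 10 >= f_mhz * slow_pct


def calibrate(f_mhz, v_mv=3300, step_mv=10, slow_pct=105, max_iter=1000):
    """Return a supply (in mV) at which the normal copy passes and the slow copy fails."""
    for _ in range(max_iter):
        if not replica_passes(v_mv, f_mhz):
            v_mv += step_mv          # normal copy fails: supply too low
        elif replica_passes(v_mv, f_mhz, slow_pct):
            v_mv -= step_mv          # slow copy still passes: margin too large
        else:
            return v_mv              # "just above" the failure point
    raise RuntimeError("calibration did not converge")


v_cal = calibrate(100)
```

In this model the loop only converges if the voltage step is smaller than the window between the two copies' failure points, which suggests that the slowdown of the second copy and the converter resolution have to be chosen together.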
[0023] The FIG. 2 system uses the operating system
to react to variations in the system load. As more tasks
are entered into the "to-do" list, the operating system of
PE 100 computes the correct way to balance the
additional computational requirements and allocates the
tasks to the processors. It then computes the required
operating frequency.

[0024] It is noted that the frequency is gradually
programmed into the system (as shown by the stepped
changes in FIG. 3). This prevents excessive noise on
the VDD-local supply voltage and possible circuit failure.
For example, if the system is operating at 50MHz and it
needs to operate at 75MHz, the clock frequency is
increased slowly, perhaps even as slowly as in 5MHz
increments. In addition, as indicated above, the VDD-local
supply voltage is increased ahead of increasing the
frequency of the clock that operates the PEs, when
increased processing capability is desired, and the clock
is reduced ahead of reducing the supply voltage when
reduced processing capability will suffice.
[0025] Of course, VDD-local can only be reduced so
far before the circuits start to fail, at which point the
operating system employs gated clocking techniques to
"shut down" PEs that are not needed. Of course, the fact
that supply voltage VDD-local varies as a function of load
should be accounted for in the interface between the
PEs 101-104 and PE 100 (as well as in the interface
between the multiprocessor chip and the "outside
world"). This is accomplished with level converter 150,
which is quite conventional. It basically converts
between the voltage level of PEs 101-104 and the voltage
level of PE 100.

[0026] The notion of adjusting operating frequency to
load and adjusting supply voltage to track the operating
frequency can be extended to allow each PE to have its
own supply voltage. The benefit of this approach for
some applications becomes apparent when it is realized
that the chip-wise voltage scaling is most effective when
the load of the computation can be evenly distributed
across all of the PEs. In some applications, however,
one may encounter tasks that cannot be partitioned into
concurrent evenly-loaded threads and, therefore, some
PE within the multiprocessor would require a higher
operating frequency and a higher operating voltage. This
would require raising the frequency and voltage of the
entire multiprocessor chip.

[0027] A separate power supply for each PE in a chip
overcomes this limitation by allowing the operating
system to independently program the lowest operating
frequency and corresponding lowest supply voltage for
each PE. The architecture of such an arrangement is
shown in FIG. 4. Each PE in FIG. 4 needs an independent
controller that performs the functions of PE 100
(except it does not divide tasks among PEs). As shown in
FIG. 4, all of the controllers are embodied in a single
controller 200, which may be just another processing
element of the integrated circuit that contains the other
processing elements. Each processing element also
requires a calibration circuit like circuit 120, and a voltage
converter circuit like circuits 130 and 140. It also has a
PE 200 that assigns the tasks given to the multi-processor
chip of FIG. 4 among the PEs.
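
The benefit of per-PE supplies for an unevenly divided load can be estimated with a rough comparison. The sketch below reuses the frequency and voltage pairs quoted in this description and assumes that power is proportional to frequency times voltage squared and that idle cycles on lightly loaded PEs are clock-gated, so each PE's switching activity follows its own required frequency; the function names and the sample load are hypothetical.

```python
# MHz -> V, pairs quoted in this description.
OPERATING_POINTS = {70: 1.8, 100: 2.1, 140: 2.7}


def relative_power(freqs_mhz, volts):
    """Sum of f * V^2 over all PEs, in arbitrary units."""
    return sum(f * v ** 2 for f, v in zip(freqs_mhz, volts))


def per_pe_vs_shared(required_mhz):
    """Ratio of power with per-PE supplies to power with one shared supply."""
    shared_v = OPERATING_POINTS[max(required_mhz)]  # shared supply follows the busiest PE
    shared = relative_power(required_mhz, [shared_v] * len(required_mhz))
    per_pe = relative_power(required_mhz,
                            [OPERATING_POINTS[f] for f in required_mhz])
    return per_pe / shared


# Hypothetical uneven load: one PE needs 140MHz, three need only 70MHz.
ratio = per_pe_vs_shared([140, 70, 70, 70])
```

For this sample load the per-PE arrangement uses about two thirds of the power of the shared-supply arrangement under the stated assumptions; without clock gating in the shared case the gap would be larger.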
[0028] It may be noted that if the frequencies at which
the individual PEs operate differ from one another and
from other elements within the system where the
multiprocessor chip is employed, there is an issue of
synchronization that must be addressed. That is, a
synchronization scheme must be implemented when there is a
need to communicate data between PEs (or with other
system elements) that operate at different frequencies.
It is possible to arrange the frequencies so that the
collection of tasks that are assigned to the multiprocessor
is completed at a predetermined time. In such a case,
the synchronization problem of the multiprocessor
vis-a-vis other elements within the system where the
multiprocessor is employed is minimized. However, that
leaves the issue of synchronizing the exchange of data
among the PEs of a multiprocessor chip.
[0029] To effect such synchronization, each PE within
the FIG. 4 arrangement is connected to an arrangement
comprising elements 150 and 160. Level converter 150
converts the variable voltage swings of the PEs to a
fixed level swing, and network 160 resolves the issue of
different clock domains.

[0030] The principles disclosed above for a multiprocessor
are extendible to other system arrangements. This
includes systems with a plurality of separate processor
elements that operate at different frequencies and
operating voltages, as well as components that are not
typically thought of as processor elements. For example,
there is a current, often-used practice to maintain
program code and data for different applications of a
personal computer in a fast memory. As each new application
is called, more information is stored in the fast memory,
until that memory is filled. Thereafter, when a new
application is called, some of the information in the fast
memory is discarded, some other information is placed
in the slower hard drive, and the released memory is
populated with the new application. It is possible to
anticipate that information stored in the fast memory is so old
as to be unlikely to be accessed before a new application
is called. When so anticipated, some of the fast
memory can be released (storing some of the data that
needs to be remembered) at a leisurely pace. That is, a
lower clock frequency can be employed in connection
with the fast memory and the hard drive, with a
correspondingly lower supply voltage, resulting in an overall
power saving in both the memory's operation and in the
operation of the hard drive.

[0031] The above description illustrated the principles
of this invention, but it should be realized that a skilled
artisan may easily make various modifications and
improvements that are within the scope of this invention
as defined by the appended claims. For example, in one
of the embodiments disclosed above all of the PEs in a
multi-processor chip are subjected to a single controlled
supply voltage. In another embodiment disclosed above
each of the PEs in a multi-processor chip is subjected
to its own, individually controlled, supply voltage. It
should be realized, however, that a middle ground is
also possible; i.e., the PEs of a multi-processor chip can
be divided into groups, and each group of PEs can be
arranged to operate from its own controlled supply
voltage. To cite another example, the FIG. 2 embodiment
employs two almost identical critical path circuits to
establish the minimum supply voltage. Alternatively, the
voltage may be set in accordance with a preset
frequency-voltage relationship that is not unlike the one
depicted in FIG. 1.

[0032] It should also be noted that level converter 150
is interposed in FIG. 2 between PE 100 and the other
PEs because PE 100 is operating off VDD. PE 100 can
also be operated off VDD-local, in which case the level
converter is interposed between PE 100 and the
input/output port of the FIG. 2 circuits that interacts with PE
100.

[0033] It should further be noted that the power supply
circuit need not have any elements outside the circuit
itself (as depicted in FIG. 2). A skilled artisan would be
aware that circuit designs exist that can be manufactured
wholly within an integrated circuit.
[0034] Yet another modification may be implemented 
by discarding the two-step application of voltages and 
frequencies of FIG. 3 when appropriate timing condi- 
tions are met. 



Claims

1. A method for controlling power consumption of a
system sub-circuit comprising the steps of:

ascertaining time allotted for carrying out an
assigned task;

determining a lowest frequency at which or
above which the sub-circuit must operate in
order to complete execution of the assigned task
within the allotted time; and

based on characteristics of the sub-circuit,
setting a supply voltage that is applied to the
sub-circuit to a lowest level that insures proper
operation of the sub-circuit at the determined
frequency.

2. The method of claim 1, carried out in a multiprocessor
sub-circuit, wherein said assigned task comprises
a plurality of sub-tasks, the method further
comprising the step of

apportioning said sub-tasks among processors
of said multiprocessor sub-circuit, resulting in
one of said processors carrying the largest load
of sub-tasks processing, compared to the
sub-tasks processing load of others of said
processors, where

said step of apportioning is executed prior to
said step of determining, and

said step of determining ascertains the lowest
frequency at which the processor carrying the
largest load of sub-tasks processing may
operate in order to complete its assigned sub-tasks
processing within the allotted time.

3. The method of claim 2 further comprising the steps
of:

determining a new lowest frequency, when a
new task is assigned, at which or above which
the sub-circuit must operate in order to
complete execution of the assigned task within the
allotted time;

comparing the lowest frequency to the new
lowest frequency to determine whether a new
operating frequency should be set for said
sub-circuit;

when said step of comparing determines that
the new lowest frequency may be lower than
said lowest frequency, reducing the frequency
at which said sub-circuit is set to operate and,
thereafter, reducing the supply voltage that is
applied to the sub-circuit; and

when said step of comparing determines that
the new lowest frequency must be higher than
said lowest frequency, increasing the supply
voltage that is applied to the sub-circuit and,
thereafter, increasing the frequency at which
said sub-circuit is set to operate to said new
lowest frequency.

4. A circuit that includes a processor, comprising:

a controller, responsive to an applied task and
to a specification for a time interval that may be
devoted to executing said task, for developing
a frequency of operation for said processor that
is the lowest frequency of operation that allows
completion of said applied task within said time
interval;

a calibration circuit responsive to said controller
for directing creation of a supply voltage for said
processor; and

a power supply responsive to said calibration
circuit, for developing said supply voltage for
said processor and applying said supply
voltage to said processor;

wherein said controller directs said processor
to execute said task after said supply voltage is
applied to said processor and the frequency of a clock
applied to said processor is set to said lowest
frequency of operation that allows completion of said
applied task within said time interval.

5. The circuit of claim 4 further comprising a level
converter circuit interposed between input/output ports
of said circuit and said processor, to convert voltage
levels passing between said input/output ports
and said processor.

6. The circuit of claim 4 wherein said controller
includes a generator of clock signals that develops a
first clock signal having a first frequency and applied
to said calibration circuit, and a second clock signal
having a second frequency applied to said processor,
wherein the second frequency can be set to
said first frequency or to a lower frequency.

7. The circuit of claim 4 wherein said task includes a
plurality of sub-tasks, said processor comprises a
plurality of processing elements, said controller
partitions said sub-tasks among said processing
elements and develops said frequency of operation for
said processor based on said partitioning.

8. The circuit of claim 7 wherein said controller
develops said frequency of operation for said processor
by evaluating the lowest frequency of operation for
a most-burdened processing element that would
still complete execution within said time interval,
wherein the most-burdened processing element is
a processing element to which sub-tasks are
allocated that require, in the aggregate, the most
processing time.

9. The circuit of claim 4 wherein said processor
comprises N processing elements, said controller
comprises N controller sub-modules, said calibration
circuit comprises N calibration circuit sub-modules,
and said power supply comprises N power supply
modules, and wherein

the i-th calibration circuit sub-module is
responsive to the i-th controller sub-module and
directs the i-th power supply module, the i-th power
supply module provides power to the i-th processing
element and the i-th processing element is
responsive to the i-th controller sub-module.

10. The apparatus of claim 9 further comprising a
processing element for accepting said task and,
when said task comprises a plurality of sub-tasks,
for partitioning said sub-tasks among the N
processing elements.



11. The apparatus of claim 9 further comprising a level
converter associated with each of said processing
elements and coupled to input/output ports of said
associated processing elements.



12. A circuit comprising:

a controller processing element;

a plurality of task-handling processing
elements;

a calibration circuit responsive to said controller
processing element for directing creation of a
supply voltage for said processor; and

a power supply circuit, responsive to said
calibration circuit, for developing a supply voltage
for said task-handling processing elements;

wherein said controller processing element
directs said task-handling processing elements
to execute tasks at a selected processing
frequency.


13. A method for operating a processor comprising the 
step of applying a supply voltage to said processor 
as a function of frequency necessary to operate 
said processor to complete an assigned task within 
an assigned time interval. 

14. The method of claim 13 wherein said function sub- 
stantially minimizes power consumption in said 
processor. 
